llama model
- Africa > Ghana (0.05)
- North America > United States > Pennsylvania > Lackawanna County > Scranton (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Europe > United Kingdom (0.04)
- Leisure & Entertainment (0.68)
- Health & Medicine (0.68)
- Government > Regional Government > North America Government > United States Government (0.47)
- Media > Music (0.46)
- Europe > Austria > Vienna (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.95)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
ConfGuard: A Simple and Effective Backdoor Detection for Large Language Models
Wang, Zihan, Zhang, Rui, Li, Hongwei, Fan, Wenshu, Jiang, Wenbo, Zhao, Qingchuan, Xu, Guowen
Backdoor attacks pose a significant threat to Large Language Models (LLMs), where adversaries can embed hidden triggers to manipulate an LLM's outputs. Most existing defense methods, primarily designed for classification tasks, are ineffective against the autoregressive nature and vast output space of LLMs, thereby suffering from poor performance and high latency. To address these limitations, we investigate the behavioral discrepancies between benign and backdoored LLMs in output space. We identify a critical phenomenon which we term sequence lock: a backdoored model generates the target sequence with abnormally high and consistent confidence compared to benign generation. Building on this insight, we propose ConfGuard, a lightweight and effective detection method that monitors a sliding window of token confidences to identify sequence lock. Extensive experiments demonstrate ConfGuard achieves a near 100% true positive rate (TPR) and a negligible false positive rate (FPR) in the vast majority of cases. Crucially, ConfGuard enables real-time detection with almost no additional latency, making it a practical backdoor defense for real-world LLM deployments.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
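The sliding-window idea in the ConfGuard abstract can be illustrated with a minimal sketch. The abstract does not state the window size, the confidence threshold, or the exact decision rule, so `window`, `threshold`, and the "every token in the window is above the threshold" rule below are assumptions made for illustration, not the authors' method. Token confidences are taken to be the probability the model assigned to each generated token.

```python
from collections import deque
from typing import Iterable, Optional


def detect_sequence_lock(
    confidences: Iterable[float],
    window: int = 5,          # assumed window size, not from the paper
    threshold: float = 0.99,  # assumed confidence threshold, not from the paper
) -> Optional[int]:
    """Return the index of the token at which a full window of consecutive
    confidences first stays at or above `threshold`, or None if that never
    happens. This is one possible reading of "sequence lock"."""
    recent = deque(maxlen=window)  # holds only the last `window` confidences
    for i, p in enumerate(confidences):
        recent.append(p)
        if len(recent) == window and min(recent) >= threshold:
            return i
    return None


# Hypothetical per-token confidences for two generations.
benign = [0.62, 0.91, 0.45, 0.99, 0.80, 0.97, 0.55]
locked = [0.60, 0.85, 0.995, 0.999, 0.998, 0.997, 0.999, 0.70]

print(detect_sequence_lock(benign))  # no window qualifies
print(detect_sequence_lock(locked))  # flagged once five high values line up
```

Because the check runs once per generated token and keeps only a fixed-size buffer, a detector of this shape could run alongside decoding, which is consistent with the low-latency claim in the abstract.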
BioCoref: Benchmarking Biomedical Coreference Resolution with LLMs
Salem, Nourah M, White, Elizabeth, Bada, Michael, Hunter, Lawrence
Coreference resolution in biomedical texts presents unique challenges due to complex domain-specific terminology, high ambiguity in mention forms, and long-distance dependencies between coreferring expressions. In this work, we present a comprehensive evaluation of generative large language models (LLMs) for coreference resolution in the biomedical domain. Using the CRAFT corpus as our benchmark, we assess the LLMs' performance with four prompting experiments that vary in their use of local, contextual enrichment, and domain-specific cues such as abbreviations and entity dictionaries. We benchmark these approaches against a discriminative span-based encoder, SpanBERT, to compare the efficacy of generative versus discriminative methods. Our results demonstrate that while LLMs exhibit strong surface-level coreference capabilities, especially when supplemented with domain-grounding prompts, their performance remains sensitive to long-range context and mention ambiguity. Notably, the LLaMA 8B and 17B models show superior precision and F1 scores under entity-augmented prompting, highlighting the potential of lightweight prompt engineering for enhancing LLM utility in biomedical NLP tasks.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > Colorado (0.04)
Evolution of Meta's LLaMA Models and Parameter-Efficient Fine-Tuning of Large Language Models: A Survey
Abdullah, Abdulhady Abas, Zubiaga, Arkaitz, Mirjalili, Seyedali, Gandomi, Amir H., Daneshfar, Fatemeh, Amini, Mohammadsadra, Mohammed, Alan Salam, Veisi, Hadi
This review surveys the rapid evolution of Meta AI's LLaMA (Large Language Model Meta AI) series, from LLaMA 1 through LLaMA 4, and the specialized parameter-efficient fine-tuning (PEFT) methods developed for these models. We first describe the LLaMA family of foundation models (7B-65B to 288B parameters), their architectures (including native multimodal and Mixture-of-Experts variants), and key performance characteristics. We then describe and discuss the concept of PEFT, which adapts large pre-trained models by updating only a small subset of parameters, and review five PEFT methods that have been applied to LLaMA: LoRA (Low-Rank Adaptation), LLaMA-Adapter V1 and V2, LLaMA-Excitor, and QLoRA (Quantized LoRA). We discuss each method's mechanism, parameter savings, and example application to LLaMA (e.g., instruction tuning, multimodal tasks). We provide structured discussion and analysis of model and adapter architectures, parameter counts, and benchmark results (including examples where fine-tuned LLaMA models outperform larger baselines). Finally, we examine real-world use cases where LLaMA-based models and PEFT have been successfully applied (e.g., legal and medical domains), and we discuss ongoing challenges and future research directions (such as scaling to even larger contexts and improving robustness). This survey paper provides a one-stop resource for ML researchers and practitioners interested in LLaMA models and efficient fine-tuning strategies.
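To make the "parameter savings" of LoRA mentioned above concrete, here is a small worked example. LoRA keeps a weight matrix of shape d_out x d_in frozen and trains two low-rank factors, B (d_out x r) and A (r x d_in), so the trainable count per adapted matrix is r * (d_in + d_out). The 4096 x 4096 shape and rank 8 are illustrative values chosen here, not figures taken from the survey.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Return (full, lora): parameters in the dense weight matrix versus
    trainable parameters in its rank-`rank` LoRA factors B and A."""
    full = d_in * d_out            # frozen dense weight W
    lora = rank * (d_in + d_out)   # A is rank x d_in, B is d_out x rank
    return full, lora


# Illustrative shape and rank (assumptions, not values from the survey).
full, lora = lora_param_counts(d_in=4096, d_out=4096, rank=8)
print(full, lora, f"{lora / full:.4%}")
```

For this shape the adapter trains well under one percent of the parameters of the matrix it adapts; the total saving for a whole model depends on which matrices receive adapters.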
- Asia > Middle East > Iraq > Kurdistan Region (0.04)
- Oceania > Australia (0.04)
- North America > United States > Virginia (0.04)
- Overview (1.00)
- Research Report > Promising Solution (0.67)
- Law > Statutes (1.00)
- Law > Business Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
- Europe > Switzerland > Zürich > Zürich (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
A API Details
API calls for each position identified in a piece of text.
Question Answering: We use the Atlas model of Izacard et al. (2022) finetuned on Natural Questions.
Calculator: Our calculator is based on a simple Python script and only supports the operators " It does not return any result for syntactically invalid equations.
"=", "equals", "equal to", "total of", "average of" followed by a number, or (iii) contain at least three English text before generating API calls.
Below, we list the prompts used to sample API calls for each tool considered.
Your task is to add calls to a Question Answering API to a piece of text.
Input: Joe Biden was born in Scranton, Pennsylvania.
Output: Joe Biden was born in [QA("Where was Joe Biden born?")] Scranton, [QA("In
Output: Coca-Cola, or [QA("What other name is Coca-Cola known by?")] Coke, is
Your task is to add calls to a Calculator API to a piece of text.
- North America > United States > Pennsylvania > Lackawanna County > Scranton (0.24)
- Africa > Ghana (0.05)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Europe > United Kingdom (0.04)
- Government > Regional Government > North America Government > United States Government (1.00)
- Leisure & Entertainment (0.68)
- Health & Medicine (0.68)
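The calculator tool in the API Details excerpt above is described only as a simple Python script that returns no result for syntactically invalid equations. A minimal sketch of a calculator with that behavior follows. The list of supported operators is cut off in the excerpt, so the four basic arithmetic operators used here are an assumption, and the function name `calculate` is hypothetical.

```python
import ast
import operator
from typing import Optional

# Assumed operator set: the excerpt's own list is truncated.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node: ast.AST) -> float:
    """Evaluate a parsed expression made only of numbers and allowed operators."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand)
    raise ValueError("unsupported expression")


def calculate(expression: str) -> Optional[float]:
    """Return the value of an arithmetic expression, or None when it is
    syntactically invalid, uses anything unsupported, or divides by zero."""
    try:
        tree = ast.parse(expression, mode="eval")
        return _evaluate(tree.body)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return None


print(calculate("2 + 3 * 4"))
print(calculate("2 +"))  # invalid input yields no result
```

Parsing with `ast` instead of calling `eval` means names, attribute access, and function calls are rejected rather than executed, which matters when the expression text comes from model output.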
Influence-driven Curriculum Learning for Pre-training on Limited Data
Schoenegger, Loris, Thoma, Lukas, Blevins, Terra, Roth, Benjamin
Curriculum learning, a training technique where data is presented to the model in order of example difficulty (e.g., from simpler to more complex documents), has shown limited success for pre-training language models. In this work, we investigate whether curriculum learning becomes competitive if we replace conventional human-centered difficulty metrics with one that more closely corresponds to example difficulty as observed during model training. Specifically, we experiment with sorting training examples by their training data influence, a score which estimates the effect of individual training examples on the model's output. Models trained on our curricula are able to outperform ones trained in random order by over 10 percentage points in benchmarks, confirming that curriculum learning is beneficial for language model pre-training, as long as a more model-centric notion of difficulty is adopted.
- Europe > Austria > Vienna (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Singapore (0.05)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
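The curriculum construction described in the abstract above, sorting training examples by an influence score, can be sketched in a few lines. The abstract does not say how the scores are computed or in which direction examples are ordered, so the precomputed `influence` list and the `descending` flag are assumptions, and `order_by_influence` is a hypothetical helper name.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def order_by_influence(
    examples: Sequence[T],
    influence: Sequence[float],
    descending: bool = False,  # ordering direction is an assumption
) -> List[T]:
    """Return the examples reordered by their influence scores.
    Ties keep their original relative order because the sort is stable."""
    if len(examples) != len(influence):
        raise ValueError("examples and influence must have the same length")
    order = sorted(range(len(examples)), key=lambda i: influence[i], reverse=descending)
    return [examples[i] for i in order]


# Hypothetical documents and scores.
docs = ["doc_a", "doc_b", "doc_c"]
scores = [0.3, -0.1, 0.9]
print(order_by_influence(docs, scores))
print(order_by_influence(docs, scores, descending=True))
```

The resulting list would then be fed to the training loop in order, in place of a shuffled data loader.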
Scaling behavior of large language models in emotional safety classification across sizes and tasks
Pinzuti, Edoardo, Tüscher, Oliver, Castro, André Ferreira
Understanding how large language models (LLMs) process emotionally sensitive content is critical for building safe and reliable systems, particularly in mental health contexts. We investigate the scaling behavior of LLMs on two key tasks: trinary classification of emotional safety (safe vs. unsafe vs. borderline) and multi-label classification using a six-category safety risk taxonomy. To support this, we construct a novel dataset by merging several human-authored mental health datasets (> 15K samples) and augmenting them with emotion re-interpretation prompts generated via ChatGPT. We evaluate four LLaMA models (1B, 3B, 8B, 70B) across zero-shot, few-shot, and fine-tuning settings. Our results show that larger LLMs achieve stronger average performance, particularly in nuanced multi-label classification and in zero-shot settings. However, lightweight fine-tuning allowed the 1B model to achieve performance comparable to larger models and BERT in several high-data categories, while requiring <2GB VRAM at inference. These findings suggest that smaller, on-device models can serve as viable, privacy-preserving alternatives for sensitive applications, offering the ability to interpret emotional context and maintain safe conversational boundaries. This work highlights key implications for therapeutic LLM applications and the scalable alignment of safety-critical systems.
- Europe > Germany > Rheinland-Pfalz > Mainz (0.05)
- Europe > Germany > Saxony-Anhalt > Magdeburg (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Freising (0.04)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.71)
- Health & Medicine > Consumer Health (0.47)